Noise robust speaker verification with delta cepstrum normalization

Authors

  • Naoyuki Kanda
  • Ryu Takeda
  • Yasunari Obuchi
Abstract

This paper introduces a delta cepstrum normalization (DCN) technique for speaker verification under noisy conditions. Cepstral feature normalization techniques are widely used to mitigate spectral variations caused by various types of noise; however, little attention has been paid to normalizing delta features. A DCN technique that normalizes not only base features but also delta features was recently proposed and showed high robustness in speech recognition and language identification. Here, we apply DCN to a state-of-the-art speaker verification system that uses iVectors and probabilistic linear discriminant analysis. It is not obvious whether DCN is effective for speaker verification, because DCN strongly transforms cepstral features and may distort speaker-specific properties. We compared DCN with cepstral mean normalization (CMN), mean variance normalization (MVN), and histogram equalization (HEQ) on a NIST 2008 SRE dataset with various noise settings, and found that DCN is very effective even for speaker verification. DCN was especially effective under noisy conditions, achieving a relative error reduction of up to 18.5% compared with the competing methods. Combining verification scores from the various feature normalization methods further improved accuracy, yielding relative error reductions of 9.1% and 16.4% under clean and noisy conditions, respectively.
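
To make the baseline normalizations named in the abstract concrete, the following is a minimal sketch of per-utterance CMN and MVN, plus a simple delta computation. The last function, `normalize_base_and_delta`, only illustrates the general idea of normalizing delta features as well as base features; it is an assumption for illustration and is not the DCN algorithm evaluated in the paper, which the abstract does not specify. All function names, the two-frame delta window, and the toy feature values are hypothetical.

```python
from statistics import fmean, pstdev

# An utterance is a list of frames; each frame is a list of cepstral coefficients.
Frames = list[list[float]]


def cmn(feats: Frames) -> Frames:
    """Cepstral mean normalization: subtract each coefficient's per-utterance mean."""
    means = [fmean(col) for col in zip(*feats)]
    return [[x - m for x, m in zip(frame, means)] for frame in feats]


def mvn(feats: Frames, eps: float = 1e-8) -> Frames:
    """Mean variance normalization: zero mean and (near) unit variance per coefficient."""
    cols = list(zip(*feats))
    means = [fmean(col) for col in cols]
    stds = [pstdev(col) for col in cols]
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in feats
    ]


def delta(feats: Frames) -> Frames:
    """First-order delta as a central difference; edge frames are clamped.

    Assumption: a +/-1 frame window. Real front ends often use a wider
    regression window.
    """
    n = len(feats)
    out = []
    for t in range(n):
        prev_frame = feats[max(t - 1, 0)]
        next_frame = feats[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)])
    return out


def normalize_base_and_delta(feats: Frames) -> Frames:
    """Illustration only (not the paper's DCN): apply MVN to the base stream
    and, separately, to the delta stream, then concatenate them per frame."""
    base = mvn(feats)
    dlt = mvn(delta(feats))
    return [b + d for b, d in zip(base, dlt)]


# Toy utterance: 3 frames, 2 coefficients (made-up values).
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
normalized = normalize_base_and_delta(toy)
```

Each frame of `normalized` holds the normalized base coefficients followed by the normalized deltas, so the feature dimension doubles. Normalizing the delta stream on its own, rather than deriving deltas only from already-normalized base features, is the design choice this sketch is meant to highlight.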

Similar resources

A fast approach to psychoacoustic model compensation for robust speaker recognition in additive noise

This paper addresses the problem of speaker verification in the presence of additive noise. We propose a fast implementation of the Psychoacoustic Model Compensation (Psy-Comp) scheme for static features, along with model-domain mean and variance normalization, for robust speaker recognition in noisy conditions. The proposed algorithms are validated through experiments on noise-corrupted NIST-2000 sp...

Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptation

Feature compensation for noise robust speech recognition becomes more effective if normalization of time-derivative parameters is taken into account. This paper describes an implementation of Delta-Cepstrum Normalization (DCN) that runs with only minimum response time. The proposed algorithm, referred to as Recursive DCN, provides word error rate improvements comparable to conventional DCN. Sin...

Cosine distance features for robust speaker verification

We use similarities with people we know already as a means to enhance the speaker verification accuracy. Motivated by this, we use cosine distance similarities with a set of reference speakers, cosine distance features (CDF), to improve the performance of speaker verification systems for clean and additive noise test conditions. We used mel frequency cepstral coefficients, power normalized ceps...

Exploring Features for Text-dependent Speaker Verification in Distant Speech Signals

Automatic speaker verification (ASV) is the task of verifying a person’s claimed identity from his/her voice using a digital computer. The existing ASV systems perform with high accuracy of verification when the speech signal is collected close to the mouth of the speaker (< 1 ft). However, the performance of the ASV systems reduces significantly for speech signals collected at a distance from ...

Noise-robust speaker verification using F0 features

This paper proposes a noise-robust speaker verification method augmented by fundamental frequency (F0). The paper first describes a noise-robust F0 extraction method using the Hough transform. Then, it proposes a robust speaker verification method using multi-stream HMMs which fuse the extracted F0 and cepstral features. Experiments are conducted using four connected-digit utterances of Japanese...

Publication date: 2013